The distribution of human linguistic groups presents a number of interesting and non-trivial
patterns. The distributions of the number of speakers per language and the area each group covers
follow log-normal distributions, while population and area fulfill an allometric relationship. The
topology of networks of spatial contacts between different linguistic groups has been recently
characterized, showing atypical properties of the degree distribution and clustering, among others.
Human demography, spatial conflicts, and the construction of networks of contacts between linguistic
groups are mutually dependent processes. Here we introduce an adaptive network model that takes all
of them into account and successfully reproduces, using only four model parameters, not only those
features of linguistic groups already described in the literature, but also correlations between
demographic and topological properties uncovered in this work. Besides its relevance when modeling
and understanding processes related to human biogeography, our adaptive network model admits
a number of generalizations that broaden its scope and make it suitable to represent interactions
between agents based on population dynamics and competition for space.

PACS numbers: 87.23.Kg, 87.10.-e, 89.75.He 


I. INTRODUCTION 

Adaptive networks, where the dynamics of nodes is coupled to the dynamics of network links, have
received considerable attention in the last decade [1]. Networks of this kind represent not only a
natural extension of models where either dynamics on complex networks or the origin of the
non-trivial topology of networks itself had been the focus of attention, but are in their own right
a field of interest. Indeed, in many natural systems node dynamics and network dynamics are
intimately coupled, and their interplay captures important aspects that would be missed if both
processes were not taken into account simultaneously. Examples are, among many others [5], neural
networks, where neuron activity affects synaptic strength [3], ecological networks, where
population dynamics is coupled to food web structure [1], catalytic networks, where the appearance
of auto-catalytic sets formed by sufficiently abundant chemical species is essential for the
maintenance of the system [5], and where the explicit introduction of space leads to segregation of
parasitic species that may otherwise disrupt network structure [6], or generic models where
distinct populations of nodes separate when connection strength is allowed to vary [7].

The coupling between node and link dynamics is especially relevant in social networks, where nodes
are, e.g., individuals, companies, human groups or countries, and links represent social contacts
of various kinds. In many of these networks, agents can actively change their interactions, thus
causing a systematic modification of network topology. A well studied case is that of epidemics,
where susceptible individuals may suppress their links with infected neighbors, leading to networks
assortative in degree and to first-order transitions between healthy and endemic states [8] and
even to infection suppression [9]. Similar situations hold in socioeconomic contexts, where
adaptive networks display an interesting phenomenology that includes phase transitions and
hysteresis between dissimilar states of agents [10] and self-organization leading to broad wealth
distributions [11]. In a broader scenario, it has been shown that changes in the state of nodes
coupled to rewiring of links systematically cause network fragmentation [12]. This fact seems to be
enhanced by the spatial embedding of many social networks, which constrains interactions [13] and
may induce the spatial separation of different socioeconomic classes [14].

In this work we address the relationship between the demographic dynamics of human linguistic
groups and the topology of their networks of spatial contacts. We present a model for an adaptive
spatial network where neighboring relationships are determined by the growth of groups and their
concomitant attempt to modify the total area they occupy. The model is based on two previous and
independent observations regarding the organization of human groups. In [15], a mean-field model
was introduced to reproduce the population-area relationship observed in human languages; the
spatial structure of groups did not play any essential role in explaining that observation, and
thus was disregarded. As a consequence, that model cannot describe the complex topology of the
network of contacts that was later uncovered [16]. Networks of contacts between linguistic groups
reflect their spatial embedding and display a set of properties previously unseen in spatial
networks. Among others, those networks have high intervality, a property shared with food webs
[17, 18]. Inspired by this latter system, and by niche models which had successfully captured that
property, a niche-like algorithm was proposed and shown to recover most topological features of
language networks [16].

Human demographic dynamics and the spread of populations in space, which determines their
inter-group contacts, are two coupled processes. As we report in this contribution, their mutual
dependence is behind observed correlations between the population of a group and the number of
spatial neighbors it has, and is needed to explain the appearance of assortative properties in
empirical language networks. Results from the adaptive network model introduced here are compared
with data from several world regions and with the topology of the network of linguistic contacts in
each of them. The paper is structured as follows. In Section II we introduce the adaptive model for
language networks. Previous relevant results are summarized for the sake of completeness and
clarity. Section III reviews data on human linguistic groups as used in this study, particularly
emphasizing the meaning of model parameters. In Section IV, we fit the model to empirical data and
show how the adaptive model reproduces, qualitatively and in most cases quantitatively, the
population-area relationship, the degree and shortest path distributions of empirical language
networks (in addition to other topological features), and non-trivial correlations between
demography and topology. The paper finishes with an overall discussion and some proposals for model
extensions and future research.


II. ADAPTIVE MODEL FOR LANGUAGE 
NETWORKS 

The adaptive network model yields a dynamic network of interactions among groups arising from
explicit demographic dynamics and competition for space. As the size of groups varies, their
neighboring relationships are modified and possible conflicts with different groups sharing
boundaries may ensue. The precise example used is the development of human linguistic groups in the
last thousand years. First, we define demographic dynamics following current knowledge on the world
population growth and suitable rules for inter-group contacts and conflicts. Second, the network of
contacts between groups is updated in the light of changes in the areas they occupy.


A. Demography and conflict rules 

The modeling of demographic dynamics is based on [15]. Dynamics relies on a stochastic
multiplicative process of the form $P_i(t+1) = \alpha_i(t) P_i(t)$ for the size of each population
$P_i$, where the distribution of $\alpha_i$ values is estimated from empirical data. This process
describes the growth of linguistic groups [?] and reproduces the observation of a log-normal
distribution of the number of speakers per language [20]. Subsequently, demographic changes are
coupled to variations in the area over which groups are spread. That model was devised with the
goal of explaining the population ($P$)-area ($A$) relationship observed in human linguistic
groups, which follows $A \propto P^z$ [15].

Relevant model parameters have been derived from world population estimations, as follows. There
were about $P_0 = 3.1 \times 10^8$ humans in year 1000 [21], while in year 2000 the world
population reached $P_T = 5.7 \times 10^9$ [?]. Assuming an exponential growth in the last ten
centuries, an average annual growth rate $\bar{\alpha} \simeq 1.0029$ is obtained, and a dispersion
$\sigma_\alpha = 0.096$ can be associated to the process [?]. The simplest distribution for the
stochastic growth rate $\alpha_i$ is a uniform distribution of average $\bar{\alpha}$ and mean
square dispersion $\sigma_\alpha$. A constant number of languages in this time interval, equal to
the current estimated linguistic diversity (6900 languages) [?], is considered. Though some
languages may have appeared in the last millennium, and many others have disappeared, in this model
we disregard language birth or extinction for the sake of simplicity. In a previous model that
constitutes the basis for the demographic dynamics here implemented, it has been numerically shown
that those two processes did not affect the statistical results [15]. As initial condition, we take
uniform populations ($P_i(0) = 3.1 \times 10^8 / 6900$), and areas $A_i = 1$, in arbitrary units.
Numerical simulations show that changes in the initial condition do not significantly affect the
final distribution of group sizes (see also [13]). Dynamics are run for 1000 time steps to compare
with current available data. In the scenario described, population dynamics are defined so as to
agree with empirical observations. Therefore, parameters $\bar{\alpha}$ and $\sigma_\alpha$
implicitly contain information on all processes that may have potentially affected demographic
changes in the last millennium (that is, births and deaths, but also casualties due to wars or
pandemic diseases, for example). This is also the reason to couple population dynamics to areas in
a directed fashion. Notice that the units of area remain undetermined up to a multiplicative
factor.
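
As a quick check of the figures quoted above, the following Python sketch recomputes the average annual growth rate from the two population estimates. It assumes pure exponential growth over 1000 annual steps, as stated in the text. It further assumes, for illustration only, that the dispersion $\sigma_\alpha$ is the standard deviation of the uniform distribution, so that its half-width is $\sqrt{3}\,\sigma_\alpha$; all variable names are illustrative.

```python
import math

P0 = 3.1e8          # world population in year 1000 (value quoted in the text)
PT = 5.7e9          # world population in year 2000 (value quoted in the text)
YEARS = 1000        # number of annual time steps
N_LANGUAGES = 6900  # constant number of languages assumed by the model

# Average annual growth rate under pure exponential growth: PT = P0 * alpha**YEARS
alpha_mean = (PT / P0) ** (1.0 / YEARS)

# ASSUMPTION: sigma_alpha is read as a standard deviation; a uniform
# distribution with standard deviation s has half-width sqrt(3) * s.
sigma_alpha = 0.096
half_width = math.sqrt(3.0) * sigma_alpha

# Uniform initial population per linguistic group
initial_population = P0 / N_LANGUAGES

print(f"alpha_mean = {alpha_mean:.4f}")
print(f"half_width = {half_width:.4f}")
print(f"P_i(0)     = {initial_population:.0f}")
```

Under these assumptions the recomputed growth rate agrees with the value 1.0029 given in the text.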

The log-normal distributions of language sizes $P_i$ [?, 20] and areas $A_i$ [?] imply that the
log-transformed variables $p_i = \ln P_i$ and $a_i = \ln A_i$ for each linguistic group $i$ follow
Gaussian distributions. As a result, the stochastic multiplicative process in the original
variables can be cast in the form of a stochastic additive process in $p_i$ and $a_i$ [15]. The
logarithmic number of individuals in a group therefore follows

$$p_i(t+1) = p_i(t) + \beta_i(t), \qquad (1)$$

where $t$ is measured in years, and $\beta_i$ is randomly drawn at each time step from a uniform
distribution $\Pi(\beta; \epsilon, \eta)$ in the interval $(\epsilon - \eta, \epsilon + \eta)$,

$$\Pi(\beta; \epsilon, \eta) = \frac{1}{2\eta}\, \Theta\big(\beta - (\epsilon - \eta)\big)\, \Theta\big((\epsilon + \eta) - \beta\big), \qquad (2)$$

with mean value $\epsilon = -0.00186$ and half-width $\eta = 0.169$ obtained when the original
multiplicative process is mapped to an additive one [?, 21]. Similarly, the logarithmic area is
assumed to obey

$$a_i(t+1) = a_i(t) + \xi_i(t). \qquad (3)$$

The evolution of $p_i$ and $a_i$ is coupled following two rules:

1. The area covered by a group shrinks when its population decreases: if $\beta_i(t) < 0$, then
$\xi_i(t)$ is randomly drawn from a uniform interval $[-r|\beta_i(t)|, 0]$.

2. Increases in the population size lead to conflict between group $i$ and one of its neighbors on
the network of contacts between groups (see below). A neighbor $j$ of node $i$ is chosen at random;
if its growth rate is smaller than that of $i$, the area of $i$ grows, and vice versa. Specifically,

(a) If $\beta_j(t) < \beta_i(t)$, $\xi_i(t)$ is drawn from $[0, w\beta_i(t)]$;

(b) If $\beta_j(t) > \beta_i(t)$, $\xi_i(t)$ is drawn from $[-w\beta_j(t), 0]$.

The spontaneous retreat parameter $r$ measures to which extent log-areas spontaneously shrink when
populations decrease. The outcome of conflicts is weighted through $w$, which determines the
associated benefit for the population with the faster growth and is, in general, different from
$r$. Actually, mean-field fits to actual values of $p_i$ and $a_i$ at present have revealed a
sub-linear relationship between population decrease and area reduction and a larger increase in
areas as a result of conflicts, yielding $w > 1 > r$ for the six world regions analyzed in [15]. It
remains to be seen whether this constraint holds in the other world regions here analyzed, and
whether the values of $w$ and $r$ in the current adaptive network model significantly deviate from
mean-field results.
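
The two coupling rules can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the interval used when the neighbor grows faster is assumed to be $[-w\beta_j, 0]$ by symmetry with the winning case, and a growing group without neighbors is assumed to keep its area.

```python
import random

EPS = -0.00186  # mean of the additive log-growth term (value from the text)
ETA = 0.169     # half-width of its uniform distribution (value from the text)


def draw_beta(rng):
    """Draw a log-growth increment beta uniformly in (EPS - ETA, EPS + ETA)."""
    return rng.uniform(EPS - ETA, EPS + ETA)


def area_increment(beta_i, beta_j, r, w, rng):
    """Log-area increment of group i, given the log-growth of i and of a neighbor j."""
    if beta_i < 0:
        # Rule 1: spontaneous retreat when the population decreases
        return rng.uniform(-r * abs(beta_i), 0.0)
    if beta_j < beta_i:
        # Rule 2(a): i grows faster than j and gains area
        return rng.uniform(0.0, w * beta_i)
    # Rule 2(b): j grows at least as fast. ASSUMPTION: i loses area in [-w*beta_j, 0]
    return rng.uniform(-w * beta_j, 0.0)


def step(p, a, neighbors, r, w, rng):
    """Advance log-populations p and log-areas a by one year."""
    betas = [draw_beta(rng) for _ in p]
    new_a = list(a)
    for i, beta_i in enumerate(betas):
        if beta_i >= 0 and not neighbors[i]:
            continue  # ASSUMPTION: an isolated growing group keeps its area
        # A random neighbor is only needed when the population grows
        beta_j = betas[rng.choice(neighbors[i])] if beta_i >= 0 else 0.0
        new_a[i] = a[i] + area_increment(beta_i, beta_j, r, w, rng)
    new_p = [p_i + b for p_i, b in zip(p, betas)]
    return new_p, new_a


if __name__ == "__main__":
    rng = random.Random(0)
    ring = [[1, 3], [0, 2], [1, 3], [0, 2]]  # four groups on a ring
    p, a = [10.0] * 4, [0.0] * 4
    for _ in range(1000):
        p, a = step(p, a, ring, r=0.7, w=2.3, rng=rng)
    print(p, a)
```

In the full model the neighbor lists would be rebuilt at every step from the updated areas, as described in the next subsection.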


B. Network dynamics 

Space is effectively introduced in the form of a network of neighbors that coevolves with the
demographic dynamics just described. The construction of the network is inspired by a static
algorithm that used a given distribution of areas and contained the rules to construct a network of
contacts between groups [16]. Now, instead, the network is continuously updated taking into
consideration the area associated to each node, as obtained in the previous step. Neighboring
relationships between nodes arise from an assumption on perimeter contact. Based on geometric
constraints, it can be assumed that the perimeter of node $i$ is comparable to the sum of
perimeters of its potential neighbors up to a multiplicative factor,

$$A_i^{1/2} \simeq f_i \sum_{j \in nn(i)} A_j^{1/2}, \qquad (4)$$

where the perimeter overlap $f_i > 0$ measures the average fraction of perimeter of each neighbor
that is shared with node $i$. In general, $f_i$, as defined in Eq. (4), is a node-dependent
quantity, but for simplicity we assume an effective value all across the network, such that $f_i$
will be substituted by its network average $f = N^{-1} \sum_i f_i$, where $N$ is the number of
nodes (languages) in the network.
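
The perimeter overlap can be illustrated with a short Python sketch that evaluates $f_i = A_i^{1/2} / \sum_{j \in nn(i)} A_j^{1/2}$ for every node and then the network average $f$. The function name and the toy network are illustrative, and skipping isolated nodes in the average is an assumption made here to avoid a division by zero.

```python
import math


def perimeter_overlap(areas, neighbors):
    """Return the list of f_i values and their network average f."""
    f_values = []
    for i, nbrs in enumerate(neighbors):
        denom = sum(math.sqrt(areas[j]) for j in nbrs)
        # ASSUMPTION: nodes without neighbors are left out of the average
        f_values.append(math.sqrt(areas[i]) / denom if denom > 0 else None)
    valid = [x for x in f_values if x is not None]
    return f_values, sum(valid) / len(valid)


if __name__ == "__main__":
    # Toy example: one large group (area 4) surrounded by three unit-area groups
    areas = [4.0, 1.0, 1.0, 1.0]
    neighbors = [[1, 2, 3], [0], [0], [0]]
    f_values, f_mean = perimeter_overlap(areas, neighbors)
    print(f_values, f_mean)
```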

The network of contacts is generated in two steps: 

1. Directed network generation. Given the (arbitrarily ordered) set of areas
$\{A_1(t), A_2(t), \ldots, A_N(t)\}$ at time step $t$, where $A_i(t) = e^{a_i(t)}$, we draw
directed links between each node $i$ and nodes at positions $i \pm 1, i \pm 2, \ldots$, until the
upper bound of the rhs of Eq. (4) is first exceeded. The first neighbor $j_0$ is either $i + 1$ or
$i - 1$ with equal probability, and subsequent nodes are chosen following the rules

$$j_{2n+1} = \begin{cases} i - n - 1, & j_0 = i + 1, \\ i + n + 1, & j_0 = i - 1, \end{cases} \qquad (5)$$

and

$$j_{2n} = \begin{cases} i + n + 1, & j_0 = i + 1, \\ i - n - 1, & j_0 = i - 1, \end{cases} \qquad (6)$$

for $n = 0, 1, 2, \ldots$ Periodic boundary conditions have been assumed when $j_n < 0$ or
$j_n > N$.

2. Transformation to an undirected network. Since spatial neighboring relationships are undirected,
the previous network should be transformed to an undirected one. Links may be added or removed so
as to guarantee the symmetry of the adjacency matrix. For this purpose we introduce a
symmetrization parameter $0 < q < 1$. If a directed link $i \to j$ does not have a reverse
counterpart, $j \not\to i$, we draw a uniformly distributed random number $x$ in $(0, 1)$ and add
the missing link $j \to i$ to the network if $x < q$. If $x > q$, the original link $i \to j$ is
removed. In any case, the relationship between $i$ and $j$ has been symmetrized after the process.
Note that this process affects neighboring relationships as defined in Eq. (4), so it will be
important to assess its eventual effect in the demographic and topological properties we aim at
reproducing.
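
The two construction steps can be sketched in Python as follows. The sketch is one possible reading of the rules, under stated assumptions: nodes are indexed from 0 with periodic wrap-around, the neighbor whose addition first makes $f \sum_j A_j^{1/2}$ exceed $A_i^{1/2}$ is kept, and all function names are hypothetical.

```python
import math
import random


def neighbor_order(i, n_nodes, first_right):
    """Yield candidate neighbors of i on alternating sides, following Eqs. (5)-(6)."""
    sign = 1 if first_right else -1
    seen = {i}
    for m in range(1, n_nodes):
        for j in ((i + sign * m) % n_nodes, (i - sign * m) % n_nodes):
            if j not in seen:  # on a small ring both sides eventually meet
                seen.add(j)
                yield j


def directed_links(areas, f, rng):
    """Step 1: draw links i -> j until f * sum_j sqrt(A_j) exceeds sqrt(A_i)."""
    n_nodes = len(areas)
    links = set()
    for i in range(n_nodes):
        target = math.sqrt(areas[i])
        total = 0.0
        for j in neighbor_order(i, n_nodes, rng.random() < 0.5):
            links.add((i, j))  # ASSUMPTION: the link that crosses the bound is kept
            total += f * math.sqrt(areas[j])
            if total > target:
                break
    return links


def symmetrize(links, q, rng):
    """Step 2: keep reciprocated links; keep a one-way link with probability q."""
    undirected = set()
    for i, j in sorted(links):
        if (j, i) in links or rng.random() < q:
            undirected.add((min(i, j), max(i, j)))
    return undirected


if __name__ == "__main__":
    rng = random.Random(0)
    areas = [math.exp(rng.gauss(0.0, 1.0)) for _ in range(20)]
    contacts = symmetrize(directed_links(areas, f=0.2, rng=rng), q=0.3, rng=rng)
    print(len(contacts), "undirected contacts")
```

Storing each undirected contact as an ordered pair makes the symmetry of the adjacency matrix automatic, at the price of losing the direction information after the second step.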

The use of a one-dimensional array of areas to construct the network is analogous to the procedure
used in ecological niche models, where a single variable suffices to reproduce most topological
properties of food webs and where the explicit consideration of population dynamics is not
essential. In the case of networks of contacts between linguistic groups, their local structure was
shown to be equivalent to that of almost regular, one-dimensional networks, with the area playing
the role of the niche variable [16].

Figure 1 illustrates some important properties of the model just described. Fig. 1(a) exemplifies
the dynamics of logarithmic areas $a_i$ and populations $p_i$, as well as the number of neighbors
of group $i$ (its degree $k_i$), for 500 years. At the end of the simulation [Fig. 1(b)], for
$t = 1000$, the degree distribution $p(k)$ is calculated. It presents a well-defined average value
and a significant tail to large $k$ values. Finally, in Fig. 1(c) we illustrate one of the
quantities that could not be reproduced by models based only on demography [15] or on a niche-like
algorithm to construct the network [16], namely, the relationship between population or area and
degree of a linguistic group. Among others, these quantities will be compared to those measured in
current linguistic groups with the purpose of establishing whether the dynamical niche model is
able to reproduce the observations and of determining the value of the model parameters that best
fits the latter. Summarizing, the dynamical niche model is characterized by four parameters: the
spontaneous retreat $r$, the outcome of conflicts $w$, the average perimeter overlap $f$, and the
symmetrization parameter $q$.

FIG. 1: Model dynamics. (a) Time series corresponding to a realization of the model showing the
dynamics of the natural logarithm of area and population size for a single linguistic group, and
the number of neighboring languages it has. (b) Degree distribution obtained at the end of the
realization for a system with 1000 interacting groups. (c) Relationship between the number of
neighbors $k$ and the total area or population for the same ensemble. Parameters are $r = 1$,
$w = 1.5$, $f = 0.2$, and $q = 0.3$, and averages over 100 and 500 independent realizations have
been performed in (b) and (c), respectively.


III. DATA ON HUMAN LINGUISTIC GROUPS 

Data on linguistic groups stem from a collection by SIL International
(http://www.ethnologue.com/) and a map developed by Global Mapping International (World Language
Mapping System, http://www.gmi.org/wlms/index.htm). A detailed description of the database appears
in the Ethnologue [21], where information on 6900 extant languages, including their spatial
distribution and number of speakers, can be found. Each language is characterized by a centroid,
which is a point in latitude-longitude coordinates that represents its average location. Centroids
are the nodes of linguistic groups. Two nodes are linked if the groups they represent share spatial
borders in any of the domains where a language is spoken (note that the speakers of a language may
occupy disconnected domains, a situation that is relatively frequent). The interested reader can
find details on network construction in [16].

In the present study, we are not taking into account languages which are widespread as a result of
colonization. Languages such as English, Spanish or Portuguese in the Americas, or Mandarin Chinese
in Asia, are in several senses outsiders: they percolate across continental regions and act as hubs
in networks of contacts between linguistic groups, enhancing the formation of large connected
components in language networks [16]. The number of neighbors of widespread languages (that is,
their degree $k_i$) is several-fold higher than that of other languages in the same network, thus
significantly deviating from the bulk degree distribution. In this sense, widespread languages can
be considered the Dragon Kings of languages [?], and the dynamical processes that underlie their
spread are different from the basic demographic dynamics implemented in our model. Widespread
languages constitute a small fraction of world languages. The 50 largest languages (with 24 or more
million speakers) represent only about 0.7% of the data points here considered. The elimination of
widespread languages causes the fragmentation of otherwise connected networks in some world
regions, notably in continental North America. This effect is not seen if, for instance, the
largest languages are eliminated in the network corresponding to continental Africa, whose largest
connected component remains essentially unchanged.

The main properties of 12 networks of linguistic groups selected for the current study are reported
in Table I. They correspond to five continental regions (Africa, Asia, Europe, and North and South
America), though in the case of North America no large connected component can be identified: the
three largest networks are found in or around Mexico, and named Mex1, Mex2 and Yucatan. In
addition, we study the networks of Australia, New Guinea, Sulawesi and Luzon islands, as well as an
additional small network found in the borders shared by Argentina, Bolivia, and Paraguay (ABP).

Some model quantities and empirical properties of the continental Africa network are represented in
Fig. 2. In Fig. 2(a) a part of the whole network is shown, emphasizing the area $A_i$ of a given
linguistic domain $i$ and its neighboring relationships. As can be seen, the perimeter overlap
depends on each pair of groups in contact. In this example, language $i$ shares boundaries with
eight different languages, so it has a degree $k_i = 8$. In practice, $f_i$ is calculated from its
definition, $f_i \simeq A_i^{1/2} / \sum_{j \in nn(i)} A_j^{1/2}$, for each node, and then averaged
over the whole network to obtain the value $f$ reported in Table I. For completeness, the last
column of Table I summarizes, for each network in the dataset, the standard deviation $\sigma_f$ of
the distribution of $f_i$ values.

CC            N      L      z      ρ      f      σ_f
C Africa      2126   6154   0.87   0.63   0.13   0.11
C Asia        1370   3967   0.65   0.72   0.10   0.10
New Guinea    663    1543   0.62   0.42   0.22   0.16
Australia     99     176    0.72   0.52   0.28   0.15
Sulawesi      64     121    0.66   0.77   0.25   0.19
Luzon         56     140    0.44   0.79   0.13   0.08
C Europe      231    547    0.60   0.65   0.13   0.31
Mex1          68     120    0.82   0.59   0.32   0.24
Yucatan       50     111    1.22   0.61   0.27   0.57
Mex2          39     71     0.67   0.73   0.32   0.24
CS America    234    399    0.40   0.56   0.28   0.17
ABP           33     59     0.66   0.76   0.37   0.42

TABLE I: Largest connected components of networks of contacts between linguistic groups obtained
for each continent. The number of nodes $N$ and the number of links $L$ are shown. Relevant
quantities that the model intends to reproduce are the exponent $z$, the correlation $\rho$, and
the average perimeter overlap $f$. The deviation of the distribution of $f_i$ values in each
network is $\sigma_f$. CC: Connected component; C Africa: continental Africa; C Asia: continental
Asia; New Guinea, Sulawesi and Luzon are islands. C Europe: continental Europe; Mex1: Mexico (1);
Yucatan: Yucatan peninsula; Mex2: Mexico (2); CS America: continental South America; ABP: ABP
borders.


A. Population-area relationship 


The relationship between the logarithm of the size of a linguistic group and the logarithm of the
area over which its speakers are spread follows an allometric relationship that has a counterpart
in ecology, where the abundance of a species and its home range are similarly related [?]. It has
been shown that area $a$ and population $p$ fulfill $a = zp + c$ for the whole world, where $c$ is
a constant, for six large continental regions, and also for groups of hunter-gatherers [15]. Since
language sizes and areas follow log-normal distributions, the transformed logarithmic variables $a$
and $p$ are well fitted by Gaussian distributions, and their joint distribution can be approximated
by a bivariate normal distribution. This joint distribution is characterized by $z$, which is the
slope of the major ellipse axis of the scatter plot containing all $(p_i, a_i)$ pairs, and a
coefficient $\rho$ that quantifies how correlated $a$ and $p$ are.

FIG. 2: Properties of the African largest connected component, with 2126 nodes. (a) Detail of the
local structure of contacts between groups. Grey regions represent different domains where a
language is spoken; one language might be spoken in disconnected domains. The centroid corresponds
to the whole of a language, and therefore might fall even outside a particular domain. (b) Degree
distribution. (c) Dependence of the logarithmic area $a$ and population $p$ of linguistic groups on
the number of neighbors $k$ in the network of spatial contacts. Error bars stand for the standard
deviation of $a$ and $p$ values at each fixed $k$.

Let us define, for each network, the average logarithmic area $\langle a \rangle$ and the average
logarithmic population $\langle p \rangle$,

$$\langle a \rangle = N^{-1} \sum_i a_i, \qquad \langle p \rangle = N^{-1} \sum_i p_i, \qquad (7)$$

and the corresponding standard deviations

$$\sigma_a^2 = N^{-1} \sum_i (a_i - \langle a \rangle)^2, \qquad \sigma_p^2 = N^{-1} \sum_i (p_i - \langle p \rangle)^2. \qquad (8)$$

The covariance matrix $C$ of $a$ and $p$ is

$$C = \begin{pmatrix} \sigma_p^2 & \rho \sigma_a \sigma_p \\ \rho \sigma_a \sigma_p & \sigma_a^2 \end{pmatrix}, \qquad (9)$$

with $\rho \sigma_a \sigma_p = N^{-1} \sum_i (a_i - \langle a \rangle)(p_i - \langle p \rangle)$.
The eigenvectors of matrix $C$ can be written as $(1, z)$, $(-z, 1)$, where $z$ corresponds to the
exponent relating both quantities. The value of $\rho$ determines the degree of correlation between
$a$ and $p$: the larger $\rho$, the more correlated the two variables are. Values of $z$ and $\rho$
obtained through this procedure for the networks here analyzed are reported in Table I. The
interested reader can find example plots of this relationship for empirical data in [15].
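
The procedure above can be made concrete with a short Python sketch. Requiring $(1, z)$ to be an eigenvector of $C$ gives the quadratic $c\,z^2 + (\sigma_p^2 - \sigma_a^2)\,z - c = 0$, with $c = \rho\sigma_a\sigma_p$, and the root associated with the largest eigenvalue (the major axis) is the one with the positive square root. The function name and the toy data are illustrative and not part of the original analysis.

```python
import math


def allometric_fit(p, a):
    """Return (z, rho) for paired log-populations p and log-areas a."""
    n = len(p)
    mean_p = sum(p) / n
    mean_a = sum(a) / n
    var_p = sum((x - mean_p) ** 2 for x in p) / n
    var_a = sum((y - mean_a) ** 2 for y in a) / n
    cov = sum((x - mean_p) * (y - mean_a) for x, y in zip(p, a)) / n
    rho = cov / math.sqrt(var_p * var_a)
    if cov == 0.0:
        raise ValueError("uncorrelated data: the major axis is parallel to p or a")
    # Slope of the eigenvector (1, z) that belongs to the largest eigenvalue of C
    diff = var_a - var_p
    z = (diff + math.sqrt(diff * diff + 4.0 * cov * cov)) / (2.0 * cov)
    return z, rho


if __name__ == "__main__":
    # Toy data lying exactly on the line a = 2 p + 1
    p = [0.0, 1.0, 2.0, 3.0]
    a = [1.0, 3.0, 5.0, 7.0]
    print(allometric_fit(p, a))
```

Note that the slope of the major axis differs in general from an ordinary least-squares slope, which would be $c / \sigma_p^2$; the two coincide only for perfectly correlated data.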











































B. Topological properties 


Networks of contacts between linguistic groups display a number of non-trivial topological
properties [16]. They are an example of quasi-interval graphs, a property they share with food webs
[16, 17, 18]. The dependence between the clustering coefficient and the linkage density $2L/N$
reveals that language networks are akin to one-dimensional regular networks at the local level
[16]. Together with intervality, this property supports the existence of a configuration space of
low dimensionality, and partly explains the success of a niche-like algorithm to account for
several of the topological properties of language networks. Two additional properties that we will
analyze and compare with model results are the shortest path length and the degree distributions.

A representative example of the degree distribution $p(k)$ is shown in Fig. 2(b). Most language
networks analyzed so far present degree distributions compatible with log-normal functions [16]. In
this work, one of our goals is to find out how likely it is that the adaptive network model
generates degree distributions compatible with observations. The same applies to the distribution
of shortest path lengths $p(d)$. The latter have a complex shape that depends on the particular
network, as will be shown. We have chosen these two distributions because of their presumable
relevance regarding inter-group dynamics. For example, the degree distribution is related to the
likelihood of entering into conflict with different linguistic (or cultural) groups as a result of
shared boundaries, but may also affect linguistic evolution due to frequent contacts with
dissimilar languages. The shortest path length distribution may play a role in the dissemination of
cultural innovations, under the reasonable assumption that intra-group spread of novelties is
significantly faster than inter-group spread: the fewer intermediates, the faster the propagation.


C. Dependence between demographic and
topological variables


Demographic and topological features of language networks are not independent. For instance,
Fig. 2(c) illustrates the empirical dependence of the logarithmic area $a$ and population $p$ on
the degree $k$. This relation is qualitatively similar to the dependence yielded by the adaptive
network model, see Fig. 1. In forthcoming sections we will make this relation quantitative by
optimizing the values of model parameters to fit empirical observations. There is a final
observation as yet unexplained, which is the appearance of population-population, area-area and
degree-degree correlations between neighboring nodes in empirical networks (see below).


IV. MODEL PARAMETERS: FITS TO
EMPIRICAL DATA

With the aim of quantitatively reproducing demographic and topological features of language groups
and networks, we analyze which values of the model parameters $r$, $w$, $f$, $q$ best fit each of
the 12 empirical networks considered. Specific goals are to reproduce the empirical parameters $z$
and $\rho$ characterizing the (logarithmic) population-area relationship, and two topological
features: the degree distribution and the distribution of shortest-path lengths. We will finally
evaluate how the dynamical model with the so obtained parameters reproduces the relationship
between demographic and topological variables, as well as the appearance of correlations in node
properties (area, population, and degree).

Quantitative values of $z$ and $\rho$ obtained with the adaptive network model fixing values of
$f^*$ and $q^*$ do not differ substantially from those obtained in the mean-field approximation
used in [15], see Table II. The mean-field coupled dynamics of growth and conflict cannot be fitted
to New Guinea island, contrary to what was observed in [15]. Such discrepancy is due to the fact
that we are only considering here connected networks, disregarding isolated languages that were
taken into account in [15] when calculating the empirical values of $z$ and $\rho$. Additionally,
in that reference the mean-field dynamics could not be fitted to the pooled set of North American
languages, for which the correlation value was appreciably smaller than the rest ($\rho = 0.40$).
Here the same phenomenon occurs for New Guinean connected languages ($\rho = 0.42$, see Table I).
Consequently, the adaptive network model cannot be fitted to the New Guinea network; hence, from
now on, we will restrict our analysis to the remaining 11 networks.

The relative difference between the surfaces $z(r, w, f^*, q^*)$ and $\rho(r, w, f^*, q^*)$ and
their mean-field counterparts $z_{MF}(r, w)$ and $\rho_{MF}(r, w)$ has been measured as

$$\Pi_z = \max_{i,j} \left| \frac{z(r_i, w_j, f^*, q^*) - z_{MF}(r_i, w_j)}{z_{MF}(r_i, w_j)} \right| \qquad (10)$$

and

$$\Pi_\rho = \max_{i,j} \left| \frac{\rho(r_i, w_j, f^*, q^*) - \rho_{MF}(r_i, w_j)}{\rho_{MF}(r_i, w_j)} \right| \qquad (11)$$

for discretizations $(r_i, w_j)$ of the $(r, w)$ parameter subspace, with $r_i \in (0, 1.5)$,
$w_j \in (0.5, 3)$, $f^* = 0.1$ and $q^* = 0.1$. Comparison with mean-field values yields
$\Pi_z = 0.12$ and $\Pi_\rho = 0.019$, which implies that maximum relative differences are around
12% and 2% for $z$ and $\rho$, respectively, when network dynamics is explicitly considered. Since
the mean-field model does not depend on $f$ and $q$, this result suggests a weak dependence of the
population-area relationship (through variables $z$ and $\rho$)
on the latter parameters. Deeper numerical explorations show that $\rho$ is almost independent of
$f$ and $q$, whereas $z$ varies moderately in the region $q \approx 0$, becoming almost constant
for $q > 0.3$.

CC            r_MF     w_MF     r              w
C Africa      0.651    2.23     0.72 ± 0.02    2.31 ± 0.02
C Asia        0.951    1.54     1.00 ± 0.01    1.56 ± 0.01
Australia     0.125    2.20     0.16 ± 0.05    2.27 ± 0.06
Sulawesi      1.27     1.25     1.32 ± 0.07    1.23 ± 0.08
Luzon         1.04     0.735    1.07 ± 0.14    0.70 ± 0.16
C Europe      0.603    1.69     0.64 ± 0.02    1.71 ± 0.02
Mex1          0.422    2.24     0.48 ± 0.07    2.30 ± 0.07
Yucatan       0.695    2.79     0.84 ± 0.12    2.92 ± 0.13
Mex2          1.16     1.16     1.11 ± 0.08    1.49 ± 0.09
CS America    0.195    1.47     0.20 ± 0.01    1.48 ± 0.02
ABP           1.20     1.33     1.27 ± 0.10    1.28 ± 0.11

TABLE II: Optimal mean-field parameters (obtained with the model in [15]) and adaptive network
model parameters $r$, $w$ for $f^* = 0.1$ and $q^* = 0.1$.

Assuming that the parameter subspace $(r, w)$ is mostly uncoupled from the subspace $(f, q)$, our
fits will be performed in two steps. First, we fit $(r, w)$ to empirical values of $z$ and $\rho$,
keeping $f^* = 0.1$ and $q^* = 0.1$ fixed (see Table I). Second, we obtain estimates for $f$ and
$q$ by imposing that simulated degree and shortest-path length distributions keep close (in a
precise sense to be defined) to empirical distributions. Finally, we check that the estimates of
$z$ and $\rho$ still reproduce empirical values for the $f$ and $q$ obtained.


A. Fitting r and w to data 


As initial guesses for (r, w), we use the mean-field values (r_MF, w_MF)
reported in Table II. Subsequently, the adaptive network model is simulated
in square neighborhoods of (r_MF, w_MF), keeping f* = 0.1 and q* = 0.1
fixed. We use model networks with the same sizes as the empirical ones and
average over 1000 model realizations. For each point (r_i, w_j) of the
grid, we estimate averages over realizations z_ij = ⟨z⟩(r_i, w_j) and
ρ_ij = ⟨ρ⟩(r_i, w_j), as well as the corresponding standard deviations
σ^z_ij and σ^ρ_ij.

Let us define y = (z, ρ)^T and x = (r, w)^T. In a local neighborhood of
each point of the grid we expect an approximate (two-dimensional) linear
dependence between y and x,

    \mathbf{y} \approx M \mathbf{x} + \mathbf{b} \qquad (12)

for a constant 2×2 (Jacobian) matrix M = (m_ij) and a vector
b = (b_1, b_2)^T to be determined. We estimate the required coefficients
by means of a two-dimensional, weighted least-squares fit to simulated
data, i.e.,

    z_{ij} = m_{11} r_i + m_{12} w_j + b_1,
    \rho_{ij} = m_{21} r_i + m_{22} w_j + b_2. \qquad (13)

The weights of the fit are chosen in the usual way, as 1/σ_ij^2, provided
that standard deviations for z and ρ are known. Note that the
least-squares method provides estimates for the standard errors of m_ij
and b_i.

Finally, r and w estimates come from

    \mathbf{x} = M^{-1} (\mathbf{y} - \mathbf{b}). \qquad (14)


The errors of r and w have been calculated using standard error
propagation according to Eq. (14). Results are listed in Table II, where
they can be compared to mean-field estimates. There are some quantitative
differences
regarding previous results in different world regions m- 
The value of the spontaneous retreat r is not always below 1, implying
that the reduction in area caused by a decrease in the population size is
not sub-linear in all cases. The four exceptions coincide with the
smallest networks in our data set (Sulawesi, Luzon, Mex2 and ABP), so we
cannot rule out that this effect reflects limited statistical power. The
relationship r < w holds in most cases, with the exception of Sulawesi
and Luzon islands.
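The local linear fit and its inversion, Eqs. (12)-(14), can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the synthetic grid and the values of M and b used in the demo are all hypothetical, and the simulated averages are replaced by noise-free planes so that the recovered (r, w) can be checked against a known answer.

```python
import numpy as np

def fit_plane(r, w, y, sigma):
    """Weighted least-squares fit of y ~ m_r*r + m_w*w + b with weights 1/sigma**2."""
    A = np.column_stack([r, w, np.ones_like(r)])
    # Dividing each row by sigma is equivalent to weighting squared residuals by 1/sigma**2.
    coef, *_ = np.linalg.lstsq(A / sigma[:, None], y / sigma, rcond=None)
    return coef  # (m_r, m_w, b)

def estimate_r_w(r, w, z, rho, sigma_z, sigma_rho, z_emp, rho_emp):
    """Fit the two planes of Eq. (13) and invert the linear map as in Eq. (14)."""
    cz = fit_plane(r, w, z, sigma_z)
    crho = fit_plane(r, w, rho, sigma_rho)
    M = np.array([cz[:2], crho[:2]])       # 2x2 matrix of slopes
    b = np.array([cz[2], crho[2]])         # intercepts
    return np.linalg.solve(M, np.array([z_emp, rho_emp]) - b)

# Synthetic demo (all numbers below are made up for illustration).
rr, ww = np.meshgrid(np.linspace(0.6, 0.8, 5), np.linspace(2.2, 2.4, 5))
r, w = rr.ravel(), ww.ravel()
M_true = np.array([[0.5, 0.2], [-0.1, 0.3]])
b_true = np.array([0.1, 0.05])
z = M_true[0, 0] * r + M_true[0, 1] * w + b_true[0]
rho = M_true[1, 0] * r + M_true[1, 1] * w + b_true[1]
sigma = np.full(r.shape, 0.01)

x_true = np.array([0.72, 2.31])
z_emp, rho_emp = M_true @ x_true + b_true   # "empirical" targets for the demo
x_hat = estimate_r_w(r, w, z, rho, sigma, sigma, z_emp, rho_emp)
```

With real simulation output, the standard errors of the fitted coefficients would additionally have to be propagated through the inversion to obtain the uncertainties of r and w; that step is omitted here.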


B. Fitting f and q to data

Now we proceed with the fit of f and q to topological quantities. For
each network size, we simulate 2000 model realizations keeping r and w
fixed and equal to the estimates previously obtained. The estimation of
f and q proceeds through two different approaches: (i) minimizing the
separation between the empirical and the simulated degree distributions;
(ii) jointly adjusting the degree and the shortest path length
distributions.


1. Optimization based on the degree distribution 

We determine f and q as the values that minimize the Hellinger distance
between the empirical degree distribution and the simulated degree
distribution. For two arbitrary discrete distributions g = (g_i) and
h = (h_i), the Hellinger distance [22] is defined as

    d_H(\mathbf{g}, \mathbf{h}) = \frac{1}{\sqrt{2}} \left[ \sum_i \left( \sqrt{g_i} - \sqrt{h_i} \right)^2 \right]^{1/2} = \frac{1}{\sqrt{2}} \left\| \sqrt{\mathbf{g}} - \sqrt{\mathbf{h}} \right\|_2, \qquad (15)

i.e., d_H is proportional to the Euclidean norm of the difference of
square-root vectors. We choose the pair (f, q) that minimizes d_H for all
networks considered here, where g_k = P_e(k) is the empirical degree
distribution, and h_k = P_s(k) is the simulated degree distribution.
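A minimal sketch of the Hellinger distance between two degree distributions is given below. The helper names are hypothetical, and the 1/√2 normalization (which bounds the distance by 1) is an assumption taken from the standard definition; a different constant only rescales all distances and does not change the location of the minimum.

```python
import numpy as np

def degree_distribution(degrees):
    """Empirical P(k) from a list of node degrees; index k holds the fraction of nodes with degree k."""
    counts = np.bincount(np.asarray(degrees, dtype=int))
    return counts / counts.sum()

def hellinger(g, h):
    """Hellinger distance between two discrete distributions.

    The shorter vector is zero-padded so that both cover the same range of k,
    and both are renormalized to sum to one.
    """
    g = np.asarray(g, dtype=float)
    h = np.asarray(h, dtype=float)
    n = max(g.size, h.size)
    g = np.pad(g, (0, n - g.size))
    h = np.pad(h, (0, n - h.size))
    g, h = g / g.sum(), h / h.sum()
    return np.linalg.norm(np.sqrt(g) - np.sqrt(h)) / np.sqrt(2.0)

# Illustrative degree sequences (made up).
p_emp = degree_distribution([1, 2, 2, 3, 3, 3, 4])
p_sim = degree_distribution([1, 1, 2, 3, 3, 4, 5])
d = hellinger(p_emp, p_sim)
```

The distance vanishes for identical distributions and reaches its maximum when the two distributions have disjoint support.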

Minimization has been carried out in two steps: first, we perform a
parameter screening in f ∈ [0.05, 1] and q ∈ [0, 1]. This yields an
estimate of the pair (f, q) that minimizes d_H. Second, we use this
estimate as the initial guess for a standard numerical minimization
algorithm.

CC            f              q              d_H (degree)   d_H (path)
C Africa      0.14 ± 0.01    0.30 ± 0.01    0.09           0.16
C Asia        0.15 ± 0.01    0.30 ± 0.01    0.12           0.56
Australia     0.12 ± 0.01    0.05 ± 0.01    0.08           0.17
Sulawesi      0.20 ± 0.01    0.20 ± 0.01    0.14           0.17
Luzon         0.11 ± 0.01    0.00 ± 0.01    0.13           0.44
C Europe      0.15 ± 0.01    0.18 ± 0.02    0.15           0.25
Mex1          0.14 ± 0.01    0.07 ± 0.01    0.15           0.37
Yucatan       0.46 ± 0.01    0.71 ± 0.01    0.21           0.23
Mex2          0.49 ± 0.02    0.84 ± 0.03    0.24           0.19
CS America    0.17 ± 0.01    0.13 ± 0.01    0.12           0.16
ABP           0.15 ± 0.01    0.12 ± 0.01    0.14           0.10

TABLE III: Fitted model parameters f and q obtained through minimization
of the distance between simulated and empirical degree distributions.

CC            f              q              d_H (degree)   d_H (path)
C Africa      0.11 ± 0.01    0.14 ± 0.01    0.13           0.11
C Asia        0.09 ± 0.01    0.25 ± 0.01    0.28           0.06
Australia     0.11 ± 0.01    0.01 ± 0.01    0.10           0.07
Sulawesi      0.21 ± 0.01    0.20 ± 0.02    0.15           0.14
Luzon         0.13 ± 0.01    0.29 ± 0.06    0.31           0.15
C Europe      0.11 ± 0.01    0.12 ± 0.01    0.19           0.04
Mex1          0.67 ± 0.03    0.85 ± 0.03    0.18           0.19
Yucatan       0.11 ± 0.03    0.06 ± 0.04    0.26           0.08
Mex2          0.61 ± 0.01    0.97 ± 0.01    0.26           0.15
CS America    0.16 ± 0.01    0.12 ± 0.02    0.13           0.11
ABP           0.16 ± 0.01    0.11 ± 0.01    0.15           0.06

TABLE IV: Fitted model parameters f and q using the joint minimization
scheme.

For large values of f and small values of q, the adaptive network model
yields disconnected graphs. This is due to the fact that large f values
imply a small number of neighbors (and therefore a low connectivity),
while small q values tend to eliminate all unpaired links. Therefore, the
number of nodes of the giant component can be well below the size of the
empirical network. In order for the sizes of empirical and model networks
to be comparable, the range of f and q values needs to be restricted. We
require

    \langle N \rangle > 0.9 N_e, \qquad (16)

N_e being the empirical network size. Parameter values yielding network
sizes outside of this region are not taken into account during
minimization. This process yields the estimates listed in Table III for f
and q, together with Hellinger's distance for the degree and the
shortest-path length distributions.
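The screening step, restricted by the size condition of Eq. (16), can be sketched as below. Everything named here is an assumption made for illustration: `toy_simulate` is a hypothetical stand-in that returns a degree distribution and a mean giant-component size, and it does not implement the rules of the adaptive network model. In an actual fit it would be replaced by averages over model realizations, and the best grid point would then seed a standard numerical minimizer.

```python
import math
import numpy as np

def hellinger(g, h):
    """Hellinger distance between two discrete distributions of equal length."""
    g, h = np.asarray(g, dtype=float), np.asarray(h, dtype=float)
    return np.linalg.norm(np.sqrt(g) - np.sqrt(h)) / math.sqrt(2.0)

def toy_simulate(f, q, n_nodes=100, k_max=25):
    """Hypothetical surrogate for the model: returns (degree distribution, mean size)."""
    def poisson(mean):
        return np.array([math.exp(-mean) * mean**k / math.factorial(k)
                         for k in range(k_max)])
    dist = (1.0 - q) * poisson(1.0 + 5.0 * (1.0 - f)) + q * poisson(8.0)
    size = n_nodes * (1.0 - 0.5 * f * (1.0 - q))   # shrinks for large f, small q
    return dist / dist.sum(), size

def screen(simulate, g_emp, n_emp, f_grid, q_grid):
    """Grid search for the (f, q) pair minimizing d_H, skipping points violating Eq. (16)."""
    best = (None, None, np.inf)
    for f in f_grid:
        for q in q_grid:
            h_sim, mean_size = simulate(f, q)
            if mean_size <= 0.9 * n_emp:   # size condition not satisfied
                continue
            d = hellinger(g_emp, h_sim)
            if d < best[2]:
                best = (f, q, d)
    return best

f_grid = np.linspace(0.05, 1.0, 20)
q_grid = np.linspace(0.0, 1.0, 11)
# Demo target generated by the surrogate itself, so the true optimum is known.
g_emp, _ = toy_simulate(f_grid[3], q_grid[5])
f_best, q_best, d_best = screen(toy_simulate, g_emp, 100, f_grid, q_grid)
```

Because the demo target is produced by the surrogate at a grid point that satisfies the size condition, the screening recovers that point with zero distance.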


2. Optimization based on the degree and shortest path
length distributions

We now apply a joint minimization of Hellinger's distance to the degree
and shortest path length distributions. The sum of both distances is used
as the objective function to minimize,

    s(f, q) = d_H(\mathbf{d}, \mathbf{e}) + d_H(\mathbf{g}, \mathbf{h}), \qquad (17)

where d = (P_e(d)) is the empirical distribution of shortest-path lengths
and e = (P_s(d)) is its simulated counterpart, and similarly for the
empirical and simulated degree distributions g = (P_e(k)) and
h = (P_s(k)). Results are listed in Table IV. The restriction given by
Eq. (16) also applies here.

Note that too large values of f or too small values of q might cause a
transition from a large connected component to a mostly disconnected
ensemble of small networks when the network dynamics step is applied.
Starting from a regime where nodes are disconnected, decreasing f and/or
increasing q eventually causes a percolation transition comparable to
that described in the standard Erdős-Rényi model [30] as the number of
links increases. The fitted values we have obtained seem to balance such
that the resulting networks are connected. For example, in Table III we
observe that low f values correspond to mostly low q values (as in Luzon,
Australia, continental Africa and Mex1), while high f values are
associated with high q values (as in Yucatan and Mex2). This association
is also observed in Table IV, with interesting, consistent inversions of
the correspondence seen in Mex1 and Yucatan.


C. Performance of the model


1. Degree distributions 


Figure 3 shows the comparison between empirical and simulated degree
distributions. Results in Fig. 3(a) have been obtained through
minimization of Hellinger's distance between degree distributions,
whereas in Fig. 3(b) the result of the joint minimization procedure is
shown. In the former case, the agreement with empirical data is very
good, even when the statistics of the original data (i.e., the network
size, see Table I) is poor. The Hellinger distances for the degree
distribution obtained with the joint minimization procedure (cf. Table IV)
are, as expected, larger than the minimum values reported in Table III.
FIG. 3: Empirical degree distributions (open circles) vs. simulated
distributions (linked squares) averaged over 2000 model realizations.
Error bars correspond to standard deviations of model-simulated degree
distributions. (a) Minimization based on the degree distribution.
(b) Joint minimization. Ranges of axes are the same in all plots.


2. Shortest path distributions 

Figure 4 shows the results of the minimization of Hellinger's distance
for the degree distribution [Fig. 4(a)] and for the degree and shortest
path distributions jointly [Fig. 4(b)]. In the former case, the agreement
between empirical and simulated distributions is poor in several cases
(especially in continental Asia), but there are some exceptions where the
empirical distribution is reasonably reproduced, for example in ABP
borders, continental South America (the third largest network) or
Sulawesi island. The null hypothesis that the model can generate networks
whose average shortest path length ⟨d⟩ is compatible with the empirical
value ⟨d_e⟩ has been statistically tested using the parameters obtained
through minimization of Hellinger's distance based only on degree
distributions. We have calculated the p-values of the null hypothesis,
Pr(⟨d⟩ > ⟨d_e⟩), which are listed in Table V. At a 99% confidence level,
the null hypothesis is rejected only for continental Asia, Luzon and
Mex1. Therefore, even if the distribution of shortest-path lengths is not
explicitly considered to estimate model parameters, the adaptive network
model cannot be statistically rejected as a generator of the average path
lengths of most empirical networks.

CC            ⟨d_e⟩    p-value
C Africa      12.9     0.46
C Asia        9.5      < 10"^
Australia     6.1      0.02
Sulawesi      5.2      0.13
Luzon         2.6      < 10"^
C Europe      5.0      0.13
Mex1          7.2      < 10"^
Yucatan       3.7      0.14
Mex2          4.8      0.18
CS America    11.9     0.33
ABP           3.4      0.17

TABLE V: Empirical average path lengths for language networks and
p-values of the null hypothesis corresponding to the adaptive network
model with minimization of Hellinger's distance on degree distributions.
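One simple way to estimate a one-sided p-value of the form Pr(⟨d⟩ > ⟨d_e⟩) is the fraction of model realizations whose average shortest-path length exceeds the empirical one. The sketch below assumes this estimator, which the text does not spell out; the function name and the synthetic sample are hypothetical and only illustrate the decision rule at the 99% confidence level.

```python
import numpy as np

def one_sided_p_value(sim_mean_paths, d_emp):
    """Fraction of realizations with average path length strictly above d_emp."""
    sim = np.asarray(sim_mean_paths, dtype=float)
    return float(np.mean(sim > d_emp))

# Made-up sample standing in for the average path lengths of 2000 realizations.
rng = np.random.default_rng(1)
sim = rng.normal(loc=5.0, scale=0.5, size=2000)

p = one_sided_p_value(sim, d_emp=5.2)
reject_at_99 = p < 0.01   # reject the null hypothesis at the 99% confidence level
```

With a finite number of realizations the smallest nonzero estimate is one over the sample size, so very small p-values can only be reported as upper bounds.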

The joint minimization significantly improves the fit to empirical
shortest-path length distributions, yielding low values of Hellinger's
distance in most cases, see Table IV. Though the joint fit to the degree
and the shortest-path length distributions worsens the performance of the
fit regarding the degree distribution, the overall fit to both
distributions is significantly improved, as can be seen by comparing the
sum d_H (degree) + d_H (path) in Tables III and IV.
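The summed distance compared above, which is also the objective of the joint minimization, can be sketched as follows. This is an illustrative implementation under assumptions: networks are given as adjacency dictionaries, assumed connected with at least two nodes, shortest-path lengths are obtained by breadth-first search, and the Hellinger distance uses the standard 1/√2 normalization. All function names are hypothetical.

```python
from collections import deque
import math
import numpy as np

def hellinger(g, h):
    """Hellinger distance; the shorter vector is zero-padded."""
    g, h = np.asarray(g, dtype=float), np.asarray(h, dtype=float)
    n = max(g.size, h.size)
    g, h = np.pad(g, (0, n - g.size)), np.pad(h, (0, n - h.size))
    return np.linalg.norm(np.sqrt(g / g.sum()) - np.sqrt(h / h.sum())) / math.sqrt(2.0)

def path_length_distribution(adj):
    """P(d) over all pairs of distinct, mutually reachable nodes (BFS from every node)."""
    counts = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for d in dist.values():
            if d > 0:
                counts[d] = counts.get(d, 0) + 1
    hist = np.zeros(max(counts) + 1)   # index d holds the count of pairs at distance d
    for d, c in counts.items():
        hist[d] = c
    return hist / hist.sum()

def joint_objective(deg_emp, deg_sim, path_emp, path_sim):
    """Sum of the Hellinger distances for path-length and degree distributions."""
    return hellinger(path_emp, path_sim) + hellinger(deg_emp, deg_sim)

# Tiny example: a path graph 0 - 1 - 2 and a triangle.
path_graph = {0: [1], 1: [0, 2], 2: [1]}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
p_path = path_length_distribution(path_graph)
p_tri = path_length_distribution(triangle)
s = joint_objective([0, 2 / 3, 1 / 3], [0, 0, 1], p_path, p_tri)
```

Each pair of nodes is visited twice (once from each endpoint), which leaves the normalized distribution unchanged.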


3. Consistency check 

Parameters r and w were obtained at constant values of the perimeter
overlap f and the symmetrization value q, namely (f*, q*) = (0.1, 0.1).
To test the consistency of our estimation procedure, we now check whether
the use of the final estimated values in Tables III and IV substantially
modifies the performance of the model regarding z and ρ. Additional
simulations for each (r, w, f, q) set of parameter values have been
carried out, and averages for exponent z and correlation ρ have been
calculated. The results are summarized in Table VI and should be compared
with empirical data in Table I. As can be seen, all empirical values lie
within the error bars, thus validating a posteriori the methodology used.
FIG. 4: Empirical shortest-path length distributions (open circles) vs. simulated distributions (linked squares) averaged over
2000 model realizations. Error bars correspond to standard deviations of model-simulated shortest-path length distributions.
(a) Minimization based on the degree distribution. (b) Joint minimization. Ranges of axes are the same in all plots.


CC            z_s            ρ_s            z_s^j          ρ_s^j
C Africa      0.91 ± 0.03    0.63 ± 0.01    0.88 ± 0.02    0.63 ± 0.01
C Asia        0.66 ± 0.02    0.72 ± 0.01    0.67 ± 0.02    0.72 ± 0.01
Australia     0.67 ± 0.12    0.52 ± 0.08    0.66 ± 0.13    0.52 ± 0.08
Sulawesi      0.63 ± 0.07    0.75 ± 0.06    0.63 ± 0.07    0.75 ± 0.06
Luzon         0.43 ± 0.05    0.78 ± 0.05    0.44 ± 0.05    0.79 ± 0.05
C Europe      0.59 ± 0.05    0.66 ± 0.04    0.60 ± 0.05    0.65 ± 0.04
Mex1          0.75 ± 0.14    0.58 ± 0.08    0.87 ± 0.15    0.60 ± 0.08
Yucatan       1.34 ± 0.27    0.63 ± 0.09    1.16 ± 0.26    0.60 ± 0.09
Mex2          0.69 ± 0.11    0.74 ± 0.08    0.69 ± 0.11    0.73 ± 0.08
CS America    0.38 ± 0.04    0.56 ± 0.04    0.38 ± 0.04    0.56 ± 0.05
ABP           0.64 ± 0.10    0.76 ± 0.08    0.64 ± 0.10    0.75 ± 0.08

TABLE VI: For each parameter set obtained through degree distribution
minimization, 2000 model realizations yield the estimates z_s and ρ_s,
and similarly for the joint minimization. Sub-index s stands for
simulation results; upper-index j indicates joint minimization. Both
estimates compare well with the empirical values reported in Table I.


4. Demographic and topological variables

Figures 5 and 6 depict the correlation between averaged logarithmic areas
and averaged logarithmic populations, respectively, and degree k.
Simulation data have been produced with the set of parameters obtained
under both minimization schemes. For visualization purposes, model
results have been displaced along the vertical axis through the addition
of an arbitrary constant (recall that area and population units are
defined up to a constant). Except for the ABP network (which is the
smallest one, with N = 33 nodes and therefore poor statistical power),
empirical logarithmic areas and populations monotonically increase with
k. Though these functions are not explicitly considered to obtain model
parameters, simulations reproduce the empirical observations with
remarkable accuracy.

The coupling between the stochastic process that determines population
and areas and the update of the network of contacts under the perimeter
overlap rule leads to the emergence of correlations between the area,
population, and degree of neighboring nodes. These autocorrelations over
the network are measured as

    T_i = \frac{1}{2} \sum_{j=1}^{N} \frac{1}{n_{ij}} \sum_{|k-j|=i} v_j v_k, \qquad (18)

where v_j stands for any of the node properties a_j, p_j or k_j, and the
second sum runs over the nodes k which are at distance i from node j (the
v_j are normalized to satisfy Σ_j v_j^2 = 1, so that autocorrelations for
different variables are independent of their natural scales and can be
mutually compared). Distances are measured as shortest-path lengths
between nodes. We normalize the product v_j v_k by n_ij, that is, by the
number of nodes that are at distance i from node j. The 1/2 factor takes
into account that all links are double-counted, since the sum runs over
the whole network.
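The autocorrelation of Eq. (18) can be sketched as follows, reading the definition literally from the description above: node values are normalized to unit sum of squares, the shell of nodes at shortest-path distance i from each node j is found by breadth-first search, and products are averaged over the shell and summed over j with a factor 1/2. The function names are hypothetical, and nodes with an empty shell at a given distance are assumed to contribute zero.

```python
from collections import deque
import numpy as np

def bfs_distances(adj, source):
    """Shortest-path lengths from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def network_autocorrelation(adj, values, max_dist):
    """Autocorrelation of a node property as a function of network distance 1..max_dist."""
    nodes = sorted(adj)
    raw = np.array([values[n] for n in nodes], dtype=float)
    norm = raw / np.sqrt(np.sum(raw**2))          # enforce sum_j v_j^2 = 1
    v = dict(zip(nodes, norm))
    corr = np.zeros(max_dist + 1)                 # corr[0] is left at zero
    for j in nodes:
        dist = bfs_distances(adj, j)
        for i in range(1, max_dist + 1):
            shell = [k for k, d in dist.items() if d == i]
            if shell:                             # n_ij = len(shell)
                corr[i] += v[j] * sum(v[k] for k in shell) / len(shell)
    return 0.5 * corr

# Tiny example: path graph 0 - 1 - 2 with equal node values.
adj = {0: [1], 1: [0, 2], 2: [1]}
corr = network_autocorrelation(adj, {0: 1.0, 1: 1.0, 2: 1.0}, max_dist=2)
```

For the three-node path with equal values, each product v_j v_k equals 1/3; the shells at distance 1 contribute a total of 1 and those at distance 2 a total of 2/3, before the factor 1/2 is applied.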

Autocorrelations over the network have been plotted in Figure 7.
Correlations decay as the separation between nodes increases,
demonstrating that the network is assortative regarding area, population
and node degree. These correlations cannot be observed if demography and
topology are uncoupled: demographic dynamics without an underlying
network of contacts correspond to a mean-field model without spatial
structure; a static algorithm that reproduces network topology is unable
to account for correlations in population sizes, a variable disregarded
in the algorithm. Model results compare only qualitatively with empirical
networks (see Fig. 7 for an example). Following an initial rapid decay,
real networks exhibit an intermediate range of node separations where
autocorrelations are roughly constant. For large distances, however,
correlations decay as predicted by the model.

FIG. 5: Logarithmic area a vs. node's degree k. Black circles correspond
to empirical data; error bars are the standard deviation of a for each
degree. Linked squares are averages over 2000 realizations of the
adaptive network model. (a) Minimization based on the degree
distribution. (b) Joint minimization. Ranges of axes are the same in all
plots.


V. DISCUSSION AND CONCLUSIONS 


The coevolution of population demography and spatial contacts represents
a new example of an adaptive network in the social sciences. By means of
a model coupling both processes, we have shown that previously known
properties of this system are robustly reproduced: population-area
relationships and language network topology. Besides, a number of
features relating demography and topology, as well as certain assortative
properties of those networks, are consistently obtained in the adaptive
network approach introduced here. Assortativity in node population, area
and degree is a by-product of successive cycles of population change and
modification of topological neighborhoods, and cannot be obtained in
scenarios where these two processes are decoupled. Remarkably, the
agreement between several empirical and simulated quantities is obtained
through fits of just four model parameters. We believe this is due to the
deep meaning of model rules, which are by themselves sufficient to
explain the qualitative properties of linguistic groups and their
associated spatial networks. There is an important exception that the
model cannot recover, New Guinea Island, whose population-area
relationship already fails to be reproduced by the mean-field approach m.
Since this is an often studied example of a region with extremely high
linguistic diversity, it is worth mentioning that the demographic and
conflict rules we implement do not suffice to yield the empirically
measured value of ρ. In general, the mean-field model is not able to
reproduce values ρ ≲ 0.5 of the area-population correlation. Here the set
of languages in the giant connected component for New Guinea (note that
this is a subset of the New Guinean languages used in [15]) yields
ρ = 0.42, which cannot be accounted for with the proposed dynamical
rules. Similarly, the correlation for all North American languages
annotated in the Ethnologue was not reproduced by the mean-field model
proposed in m.

The adaptive network model presented here admits a number of extensions.
First, the introduction of additional factors may make it more realistic.
In this study, we have not considered the appearance of new languages or
the death of existing ones. The origination of new languages can be
easily implemented by splitting an existing language. Following our
rules, a new set of neighbors and an independent evolution of each of the
two populations would then follow in a straightforward way. Death of
languages can also be considered, for instance, by eliminating those
groups whose population falls below a prescribed level (one individual,
for instance). Quantities such as the average lifetime of languages could
be studied in this scenario. Second, model rules could be modified to
consider ingredients such as language attractiveness or frequency of
conflicts dependent on degree or population size |3T].

Language attractiveness is an important driver in the disappearance of
minority languages [35], and could be implemented through a migration of
population from one language to any of its neighbors. In this way,
population sizes would be modified through processes different from
stochastic growth. Extreme versions of migration mechanisms might account
for the growth of widespread languages [53], and perhaps explain the
emergence of Dragon Kings in linguistic groups |2S]. In the model used
here, every group enters into conflict with a neighbor once per time
step. This rule could be modified to a likely more realistic version
where conflict frequency is proportional to the number of links, and not
to the number of nodes, and the outcome of conflicts could also consider
the relative population of the involved parties. Additional cultural
markers, such as political, linguistic or religious similarities, might
also modify the frequency and strength of conflicts. The results of these
modifications are difficult to foresee. Third, the formation of links is
now homogeneous and does not consider the structure of human settlements
in relation to the landscape. The introduction of preferential attachment
depending on stylized landscape features might help explain the
appearance of a low dimensional niche space in language networks
[HI |34| .

FIG. 6: Logarithmic population p vs. node's degree k. Black circles correspond to empirical data; error bars are the
standard deviation of p for each degree. Linked squares are averages over 2000 realizations of the adaptive network model.
(a) Minimization based on the degree distribution. (b) Joint minimization. Ranges of axes are the same in all plots.

FIG. 7: Autocorrelation over the network, as defined by Eq. (18), as a
function of the separation d between nodes, measured in the model for
(a) log-area, (c) log-population, and (e) degree. Model parameters are
r = 0.5, w = 2, f = 0.1, and q = 0.1 for networks with N = 500 nodes.
Quantities have been averaged over 500 independent realizations. For the
sake of comparison, we depict the autocorrelations for (b) log-area,
(d) log-population, and (f) degree, obtained for the empirical network of
continental Africa. Note the different range of the vertical axis in
panel (f).

The competition for areas between neighbouring populations is a form of
demographic conflict. In the scenario devised here, these conflicts do
not affect population sizes and by definition occur at a characteristic
time scale of the order of one year. There is a body of literature that
has addressed the frequency and distribution of conflicts with the number
of casualties in terrorist attacks [35], wars [36], or fatal quarrels in
general m as the main variable. Those events might have frequencies
measured in days and have often been modeled as processes of
fragmentation and coalescence of groups [55]. It would be interesting to
integrate the dynamical network perspective of our study with the fast
evolution of group dynamics, and its effect on population sizes,
considered in these other conflict analyses, with the goal of devising
more complete models for cultural and political clashes.
Finally, we believe that the model could be applied to other model
systems with analogous node and network dynamics. One such example is
ecology, where an explicit competition for space of species occupying the
same niche is known to occur. Further, the applicability of the model to
that system is supported by a relationship between population sizes and
ranges of occupation functionally equivalent to the population-area law
followed by human linguistic groups. Demographic dynamics similar to
those used here, perhaps with the addition of temporal biases to grow or
decrease, might represent the dynamics of agents such as companies or
religious groups, for example. Suitable modifications of how links are
established might shed light on the distribution of group sizes and on
the relevance of competition and inter-group conflicts.